Automatic Classification and Transcription of Telephone Speech in Radio Broadcast Data
نویسندگان
چکیده
Automatic transcription of telephone speech involves additional challenges compared to wideband data processing, mainly due to channel limitations and to particular characteristics of conversational telephone speech. While in TV speech recognition applications, such as automatic transcription of broadcast news, the presence of telephone data is nearly insignificant (less than 1 %), in most radio broadcast stations the presence of telephone speech grows significantly. Thus, transcription of telephone speech data deserves special attention in radio broadcast applications. In this work, we describe our initial efforts to tackle this particular problem. First, a telephone channel classifier is proposed to automatically detect telephone segments. Then, some strategies for increasing robustness of the automatic transcription system are in-
منابع مشابه
Advances in automatic transcription of Italian broadcast news
This paper presents some recent improvements in automatic transcription of Italian broadcast news obtained at ITCirst. A first preliminary activity was carried out in order to develop a suitable speech corpus for the Italian language. The resulting corpus, formed by recordings covering 30 hours of radio news, was exploited for developing a baseline system for transcription of broadcast news. Th...
متن کاملTranscription of New Speaking Styles - Voicemail
In this paper we describe a new testbed for developing speech recognition algorithms a VoiceMail transcription task, analogous to other tasks such as the Switchboard, CallHome [1] and the Hub 4 tasks [2] which are currently used by speech recognition researchers. Spontaneous speech occurring in day-today life can broadly be classi ed into two categories (i) where the speaker does not receive an...
متن کاملFOR S ENTENCE U NIT S EGMENTATION FROM S PEECH Sébastien Cuendet
The sentence segmentation task is a classification task that aims at inserting sentence boundaries in a sequence of words. One of the applications of sentence segmentation is to detect the sentence boundaries in the sequence of words that is output by an automatic speech recognition system (ASR). The purpose of correctly finding the sentence boundaries in ASR transcriptions is to make it possib...
متن کاملRecent advances in transcribing television and radio broadcasts
Transcription of broadcast news shows (radio and television) is a major step in developing automatic tools for indexation and retrieval of the vast amounts of information generated on a daily basis. Broadcast shows are challenging to transcribe as they consist of a continuous data stream with segments of different linguistic and acoustic natures. Transcribing such data requires addressing two m...
متن کاملAutomatic transcription of general audio data: preliminary analyses
The task of automatically transcribing general audio data is very different from the transcription task typically required of current automatic speech recognition systems. The general goal of this work is to quantify the difficult issues posed by such data, thus leading to an understanding of how a speech recognition system may have to be altered to accommodate the added complexities. Specifica...
متن کامل